Abstract

A version of the secretary problem is considered. The ranks of items, whose values are independent, identically distributed random variables $X_1, X_2, \ldots, X_n$ from a uniform distribution on $[0, 1]$, are observed sequentially by the grader. He has to select exactly one item, when it appears, and receives a payoff which is a function of the unobserved realization of the random variable assigned to the item, diminished by some cost. The methods of analysis are based on the existence of an embedded Markov chain and use the technique of backward induction. The result is a generalization of the selection model considered by Bearden (2006). The asymptotic behaviour of the solution is also investigated.
1 Introduction

Although a version of the secretary problem (the beauty contest problem, the dowry problem or the marriage problem) was first solved by Cayley (1875), it was not until five decades ago that there was a sudden resurgence of interest in this problem. Since the articles by Gardner (1960a, b) the secretary problem has been extended and generalized in many different directions by Gilbert and Mosteller (1966). Excellent reviews of the development of this colourful problem and its extensions have been given by Rose (1982b), Freeman (1983), Samuels (1991) and Ferguson (1989). The classical secretary problem in its simplest form can be formulated following Ferguson (1989). He defined the secretary problem in its standard form to have the following features:

(i) There is only one secretarial position available. 

(ii) The number of applicants, N, is known in advance. 

(iii) The applicants are interviewed sequentially in a random order. 

(iv) All the applicants can be ranked from the best to the worst without any ties. Further, 
the decision to accept or to reject an applicant must be based solely on the relative 
ranks of the interviewed applicants. 

(v) An applicant once rejected cannot be recalled later. The employer is satisfied with 
nothing but the very best. 

(vi) The payoff is 1 if the best of the applicants is chosen and 0 otherwise.

This model can be used as a model of choice in many decisions in everyday life, such as buying a car, hiring an employee, or finding an apartment (see Corbin (1980)). Part of the research has been devoted to modified versions of the problem in which some important assumption of the model is changed to fit a real-life context. There are analyses of the decision maker's aims. It could be that he will be satisfied by choosing one of the $K$ best (see Gusein-Zade (1966), Frank and Samuels (1980)). It was shown that the optimal strategy in this problem has a very simple threshold form. The items are observed and rejected up to some moments $j_r$ (thresholds), after which it is optimal to accept the first candidate with relative rank $r$, $r = 1, 2, \ldots, K$. The thresholds $j_r$ are decreasing in $r$. This strategy is rather intuitive: when the candidates run low, we admit accepting an item of lower rank. If the aim is to choose the second best item, then the form of the optimal strategy is not so intuitively obvious (see Szajowski (1982), Rose (1982a), Mori (1985)). At the same time, the possibility of backward solicitation and uncertain employment was also investigated (see Yang (1974), Smith and Deely (1975), Smith (1975)).

There is also experimental research with subjects confronted with the classical secretary problem (see Seale and Rapoport (1997, 2000)). The optimal strategy of the grader in the classical secretary problem is to pass $k_N - 1$ applicants, where $k_N = [N e^{-1}]$, and stop at the first $j \ge k_N$ which is better than those seen so far. If none exists, nothing is chosen. The experimental study by Seale and Rapoport (1997) of this problem shows that the subjects under study have a tendency to terminate their search earlier than in the optimal strategy. Bearden (2006) has considered an application of the best choice problem to the model of choice for a trader who makes her selling decision at each point in time solely on the basis of the rank of the current price with respect to the previous prices but, ultimately, derives utility from the true value of the selected observation and not from its rank. The assumption (vi) is not fulfilled in this case. Bearden (2006) has made efforts to explain this effect and has proposed a new payoff scheme. He shows that if the true values $X_j$ are i.i.d. uniformly distributed on $[0, 1]$, then for every $N$ the optimal strategy is to pass $c - 1$ applicants and stop with the first $j \ge c$ with rank 1. If none exists, stop at time $N$. The optimal value of $c$ is either $\lfloor \sqrt{N} \rfloor$ or $\lceil \sqrt{N} \rceil$.
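The square-root threshold quoted above can be checked numerically. The following is a minimal sketch, not Bearden's own computation: it assumes that the probability that the first relatively best item after position $r$ arrives at $j$ is $r/(j(j-1))$, that a relatively best item at position $j$ has expected value $j/(j+1)$, and that the forced choice at time $N$ has expected value $1/2$. The function names `expected_payoff` and `best_threshold` are hypothetical.

```python
import math


def expected_payoff(threshold, n):
    """Expected value obtained by skipping threshold-1 items and then taking
    the first relatively best item, or item n if no such item appears."""
    if threshold == 1:
        return 0.5  # the first item always has relative rank 1; E[X_1] = 1/2
    r = threshold - 1  # position of the last skipped item
    total = 0.0
    for j in range(threshold, n):
        # r / (j (j - 1)): chance that the first candidate after r is at j;
        # j / (j + 1): mean of the largest of j uniform variables
        total += r / (j * (j - 1)) * j / (j + 1)
    # r / (n - 1): chance of no candidate before n; then E[X_n] = 1/2
    total += r / (n - 1) * 0.5
    return total


def best_threshold(n):
    """Threshold in 1..n maximising the expected payoff (brute force)."""
    return max(range(1, n + 1), key=lambda c: expected_payoff(c, n))


for n in (5, 10, 15, 50, 100):
    c = best_threshold(n)
    print(n, c, math.floor(math.sqrt(n)), math.ceil(math.sqrt(n)))
```

For the values of $N$ tried here the brute-force maximiser coincides with one of the two integers next to $\sqrt{N}$.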

This payoff scheme, when the i.i.d. $X_j$'s come from a distribution other than the uniform one, has been studied by Samuel-Cahn (2005). Three different families of distributions, belonging to the three different domains of attraction for the maximum, have been considered, and the dependence of the optimal strategy and the optimal expected payoff on the distribution has been investigated. The different distributions can model various tendencies in the perception of the searched items.



In this paper the idea of a payoff function dependent on the true value of the item is modified to include the different personal costs of choosing the item. The cost of observation in the secretary problem with payoffs dependent on the real ranks has been investigated by Bartoszyński and Govindarajulu (1978) (see also [...] (1998)). However, the cost of decision is a different problem from the cost of observation. It will be shown that the optimal number of items one should skip is a function of this personal cost. At the last moment the payoff function can be defined slightly differently than in Bearden (2006). The asymptotic expected return and the asymptotic behaviour of the optimal strategy will be studied.



The organization of the paper is as follows. In Section 2 the Markov chain related to the secretary problem is formulated. This section is based mainly on the suggestion from Dynkin and Yushkevich (1969) and the results by Szajowski (1982) and Suchwałko and Szajowski (2002). In the next sections the solution of the rank-based secretary problem with cardinal payoff and the personal cost of the grader is given. In Section 3 the exact and asymptotic solution is provided for the model formulated in Section 2. In this consideration the asymptotic behaviour of the threshold defining the optimal strategy of the grader is studied.

In the last section a comparison of the obtained results is given.



2 Mathematical formulation of the model 



Let us assume that the grader observes a sequence of up to $N$ applicants whose values are i.i.d. random variables $\{X_1, X_2, \ldots, X_N\}$ with uniform distribution on $\mathbb{E} = [0, 1]$. The values of the applicants are not observed. Let us define

$R_k = \#\{1 \le i \le k : X_i \le X_k\}.$

The random variable $R_k$ is called the relative rank of the $k$-th candidate with respect to the items investigated up to the moment $k$. The grader can see the relative ranks instead of the true values. All random variables are defined on a fixed probability space $(\Omega, \mathcal{F}, \mathbf{P})$. The observations of the random variables $R_k$, $k = 1, 2, \ldots, N$, generate the sequence of $\sigma$-fields $\mathcal{F}_k = \sigma\{R_1, R_2, \ldots, R_k\}$, $k \in \mathbb{T} = \{1, 2, \ldots, N\}$. The random variables $R_k$ are independent and $\mathbf{P}\{R_k = i\} = \frac{1}{k}$ for $i = 1, \ldots, k$.
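The independence and uniformity of the relative ranks can be illustrated by exact enumeration. The sketch below assumes only that, for i.i.d. continuous values, every ordering of the first $n$ items is equally likely; it uses $n = 4$ and the hypothetical helper `relative_ranks`. Each of the $1 \cdot 2 \cdot 3 \cdot 4 = 24$ possible rank vectors arises from exactly one ordering, which is equivalent to the $R_k$ being independent with $R_k$ uniform on $\{1, \ldots, k\}$.

```python
from collections import Counter
from itertools import permutations


def relative_ranks(values):
    """Return (R_1, ..., R_n) with R_k = #{1 <= i <= k : X_i <= X_k}."""
    return tuple(
        sum(1 for x in values[: k + 1] if x <= values[k])
        for k in range(len(values))
    )


n = 4
# Each permutation of n distinct values stands for one equally likely ordering.
counts = Counter(relative_ranks(p) for p in permutations(range(n)))

# Marginal distribution of R_3 over the 24 orderings.
marginal_r3 = Counter(r[2] for r in counts.elements())
print(len(counts), set(counts.values()), dict(marginal_r3))
```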

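Denote by $\mathfrak{M}^N$ the set of all Markov moments $\tau$ with respect to the $\sigma$-fields $\{\mathcal{F}_k\}_{k=1}^N$. Let $g : \mathbb{T} \times \mathbb{S} \times \mathbb{E} \to \Re^+$ be the gain function. Define

(1) $v_N = \sup_{\tau \in \mathfrak{M}^N} \mathbf{E}\, g(\tau, R_\tau, X_\tau).$

We are looking for $\tau^* \in \mathfrak{M}^N$ such that $\mathbf{E}\, g(\tau^*, R_{\tau^*}, X_{\tau^*}) = v_N$.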

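Since $\{g(n, R_n, X_n)\}_{n=1}^N$ is not adapted to the filtration $\{\mathcal{F}_n\}_{n=1}^N$, the gain function can be substituted by the conditional expectation of the sequence with respect to the given filtration. By the properties of the conditional expectation we have

$\mathbf{E}\, g(\tau, R_\tau, X_\tau) = \sum_{r=1}^{N} \int_{\{\tau = r\}} g(r, R_r, X_r)\, d\mathbf{P} = \sum_{r=1}^{N} \int_{\{\tau = r\}} \mathbf{E}[g(r, R_r, X_r) \mid \mathcal{F}_r]\, d\mathbf{P} = \mathbf{E}\, \tilde{g}(\tau, R_\tau),$

where

(2) $\tilde{g}(r, R_r) = \mathbf{E}[g(r, R_r, X_r) \mid \mathcal{F}_r]$

for $r = 1, 2, \ldots, N$. On the event $\{\omega : R_r = s\}$ we have $\tilde{g}(r, s) = \mathbf{E}[g(r, R_r, X_r) \mid R_r = s]$.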

Assumption 1 In the sequel it is assumed that the grader wants to accept the best applicant so far.

The function $\tilde{g}(r, s)$ defined in (2) is equal to 0 for $s > 1$ and non-negative for $s = 1$. It means that we can choose the required item at moment $r$ only if $R_r = 1$. Denote $h(r) = \tilde{g}(r, 1)$.

A risk is connected with each decision of the grader, and the personal perception of this risk differs from one grader to another. When the decision process is dynamic, we can assume that the feeling of risk appears randomly at some moment $\xi$. Its distribution is a model of the concern for a correct choice of the applicant.

Assumption 2 It is assumed that $\xi$ has the uniform distribution on $\{0, 1, \ldots, N\}$.
Remark 2.1 Let us assume that the cost of choice, or the measure of stress related to the decision to accept the applicant, is $c$. It appears when the decision is made after $\xi$, and its measure will be the random process $C(t) = c\,\mathbb{I}_{\{\xi \ge t\}}$. Based on the observed process of relative ranks, and assuming that there is no acceptance before $k$, we have

(3) $c(k, t) = \mathbf{E}[C(t) \mid \mathcal{F}_k] = c\,\frac{N - t + 1}{N - k + 1}.$

The applied model is a consequence of the observation that the fear of a wrong decision today is higher than the concern for the consequences of a future decision.

Assumption 3 The aim of the grader is to maximize the expected value of the chosen applicant and, at the same time, to minimize the cost of choice.



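In this case the gain function is

(4) $g(t, R_t, X_t) = g_c(t, R_t, X_t) = \begin{cases} (X_t - C(t))\,\mathbb{I}_{\{1\}}(R_t) & \text{if } t < N, \\ X_N - c & \text{otherwise.} \end{cases}$

Since $X_1, X_2, \ldots, X_N$ are i.i.d. random variables with the uniform distribution on $[0, 1]$, we have for $t \ge r$

(5) $\tilde{g}_c(r, t, R_t) = \mathbf{E}[g_c(t, R_t, X_t) \mid \mathcal{F}_r] = \left(\frac{t}{t+1} - c\,\frac{N - t + 1}{N - r + 1}\right)\mathbb{I}_{\{1\}}(R_t)$

(see Resnick (1987)). Let us denote $h(r, s) = \tilde{g}_c(r, s, 1)$.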



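Define $W_0 = 1$, $\gamma_t = \inf\{r > \gamma_{t-1} : R_r = 1\}$ ($\inf \emptyset = \infty$) and $W_t = \gamma_t$. If $\gamma_t = \infty$, then define $W_t = \infty$. $W_t$ is a Markov chain with the following one-step transition probabilities

(6) $p(r, s) = \mathbf{P}\{W_{t+1} = (s, 1) \mid W_t = (r, 1)\} = \begin{cases} \frac{1}{2}, & \text{if } r = 1,\ s = 2, \\ \frac{r}{s(s-1)}, & \text{if } 1 < r < s, \\ 0, & \text{if } r \ge s \text{ or } r = 1,\ s \ne 2, \end{cases}$

with $p(\infty, \infty) = 1$, $p(r, \infty) = 1 - \sum_{s=r+1}^{N} p(r, s)$. Let $\mathcal{G}_t = \sigma\{W_1, W_2, \ldots, W_t\}$ and let $\tilde{\mathfrak{M}}^N$ be the set of stopping times with respect to $\{\mathcal{G}_t\}_{t=1}^{N}$. Since $\gamma_t$ is increasing, we can define $\tilde{\mathfrak{M}}^N_{r+1} = \{\sigma \in \tilde{\mathfrak{M}}^N : \gamma_\sigma > r\}$.

Let $\mathbf{P}_{(r,1)}(\cdot)$ be the probability measure related to the Markov chain $W_t$ with trajectory starting in state $(r, 1)$, and $\mathbf{E}_{(r,1)}(\cdot)$ the expected value with respect to $\mathbf{P}_{(r,1)}(\cdot)$. From (6) we can see that the transition probabilities depend on the moments $r$ at which items with relative rank 1 appear.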
Taking into account the form of the payoff function (5), a two-dimensional Markov chain should be considered. Denote by $Z_t : \Omega \to \mathbb{T} \times \mathbb{T}$ the Markov chain with the following one-step transition probabilities

(7a) $\mathbf{P}(Z_{t+1} = (s, j) \mid Z_t = (s, i)) = \frac{i}{j(j-1)}$ for $s \le i < j \le N$,

(7b) P{Zt+i = {k,i)\Zt = {s,i)) = j^j^ZT) iois<k<i<N, 

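and 0 otherwise.

Let us introduce the operators based on (7) and (6):

(8a) $T h(r, s) = \mathbf{E}_{(r,s)} h(Z_1) = \sum_{j=s+1}^{N-1} \frac{s}{j(j-1)}\, h(r, j) + \left(1 - \sum_{j=s+1}^{N-1} \frac{s}{j(j-1)}\right)\left(\frac{1}{2} - c\right),$

(8b) $T h(r) = \mathbf{E}_{r} h(W_1) = \sum_{j=r+1}^{N-1} \frac{r}{j(j-1)}\, h(j) + \left(1 - \sum_{j=r+1}^{N-1} \frac{r}{j(j-1)}\right)\left(\frac{1}{2} - c\right).$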



3 The cost of fear in the rank-based secretary problem with cardinal value of the item



Let $\mathfrak{M}^N_r = \{\tau \in \mathfrak{M}^N : r \le \tau \le N\}$ and $v_N(r) = \sup_{\tau \in \mathfrak{M}^N_r} \mathbf{E}\, g_c(\tau, R_\tau, X_\tau)$. The following algorithm allows one to construct the value of the problem $v_N$. Let

(9) $v_N(N) = \mathbf{E}\, g_c(N, R_N, X_N) = \mathbf{E}(X_N) - c$

and for $r < N$

(10a) $w_N(r, s) = \max\{h(r, s), T w_N(r, s)\},$

(10b) $v_N(r) = \max\{h(r), T v_N(r)\}.$

One can consider the stopping sets

(11) $\Gamma_r = \{(r, s) : h(r, s) \ge w_N(r, s),\ r < s\} \cup \{(r, N)\},$

$r \in \mathbb{T}$. In the class of such stopping sets there are solutions of the restricted problem. Based on this partial solution the optimal stopping time is constructed, and it is shown that $v_N = v_N(1)$.

Lemma 3.1 For the considered problem with the payoff function (5) and $c \in \Re^+$, there is $k_0$ such that for $r \ge k_0$ the optimal stopping time $\tau^*$ in $\mathfrak{M}^N_r$ has the form $\tau^* = \inf\{s \ge r : Y_s = 1\} \wedge N$, i.e. the stopping set is $\Gamma_r = \{(r, s) : s \ge r, Y_s = 1\} \cup \{(r, N)\}$.

Proof. The function $h(k, r) = \frac{r}{r+1} - c\,\frac{N - r + 1}{N - k + 1}$ is increasing in $r \ge k$. For $r = N$ we have $w_N(k, N) = \frac{1}{2} - c$. Let us construct the one-step look-ahead stopping time and define $k_0 = \min\{1 \le k \le N : h(s) \ge T h(s) \text{ for every } s \in [k, N]\}$. For $j > k \ge k_0$ we have $h(k) \le h(j) \le h(k, j)$ and, by the definition of $k_0$, $h(k, j) \ge h(k) \ge T h(k) \ge T h(k, j)$. The value of the problem is $w_N(k, r) = h(k, r)$, and the optimal stopping time on $\mathfrak{M}^N_{k_0}$ is defined by the stopping set $\Gamma_{k_0}$. Therefore we have $T v_N(r) = T h(r)$ for $r \ge k_0$, and the one-step look-ahead rule is optimal in $\mathfrak{M}^N_{k_0}$.



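Remark 3.2 Let us assume that $s \ge k \ge k_0$. We take the limits $\frac{k}{N} \to y$ and $\frac{s}{N} \to x$ as $N \to \infty$. We get

$\tilde{h}(y, x) = \lim_{N \to \infty} h(k, s) = 1 - c\,\frac{1 - x}{1 - y},$

$\hat{h}(y, x) = \lim_{N \to \infty} T h(k, s) = 1 - \frac{x}{2} - cx - c\,\frac{1 - x}{1 - y} - \frac{cx}{1 - y}\log(x).$

For $c \in (0, +\infty)$ the equation $\log(y) = (y - 1)\left(\frac{1}{2c} + 1\right)$ has one root $\alpha \in (0, 1)$. When $x \ge y \ge \alpha$ then $\hat{h}(y, x) \le \tilde{h}(y, x)$.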
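The limit of $T h(k, s)$ can be compared with its finite-$N$ counterpart. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $h(k, j) = \frac{j}{j+1} - c\,\frac{N-j+1}{N-k+1}$, transition weights $\frac{s}{j(j-1)}$, terminal payoff $\frac{1}{2} - c$, and the limit $1 - \frac{x}{2} - cx - c\,\frac{1-x}{1-y} - \frac{cx}{1-y}\log(x)$ obtained by replacing the sum with an integral. The names `T_h` and `h_hat` are hypothetical.

```python
import math


def T_h(k, s, n, c):
    """Expected payoff of continuing from a candidate at time s, with the
    cost weight measured from time k (assumed form of the operator T)."""
    total = 0.0
    for j in range(s + 1, n):
        h_kj = j / (j + 1) - c * (n - j + 1) / (n - k + 1)
        total += s / (j * (j - 1)) * h_kj
    # s / (n - 1): probability that no candidate appears before time n
    return total + s / (n - 1) * (0.5 - c)


def h_hat(y, x, c):
    """Assumed limit of T_h(k, s, n, c) when k/n -> y and s/n -> x."""
    return (1 - x / 2 - c * x - c * (1 - x) / (1 - y)
            - c * x * math.log(x) / (1 - y))


n, c = 20000, 0.1
finite = T_h(int(0.3 * n), int(0.5 * n), n, c)
limit = h_hat(0.3, 0.5, c)
print(finite, limit)
```

For $N = 20000$ the two numbers agree to about three decimal places, which is the size of discrepancy expected from the integral approximation.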

The optimal stopping time $\tau^*$ is defined as follows: one has to stop at the first moment $r$ when $Y_r = 1$, unless $v_N(r) > h(r)$. We can define the stopping set $\Gamma = \{r : h(r) \ge v_N(r)\} \cup \{N\}$.

Theorem 3.3 For every $c \in [0, +\infty)$ there is $k_0$ such that $\Gamma = \{r : r \ge k_0, Y_r = 1\} \cup \{N\}$ and $v_N = v_N(k_0 - 1)$.

Proof. The function $h(r) = \frac{r}{r+1} - c$ is increasing in $r$. For $r = N$ we have $v_N(N) = \frac{1}{2} - c$. Let us construct the one-step look-ahead stopping time and define $k_0 = \min\{1 \le k \le N : h(s) \ge T h(s) \text{ for every } s \in [k, N]\}$. For $j > k \ge k_0$ we have $h(k) \le h(j) \le h(k, j)$ and, by the definition of $k_0$, the value of the problem on $\mathfrak{M}^N_{k_0}$ is equal to $v_N(k_0 - 1) = T h(k_0 - 1)$, and the one-step look-ahead rule is optimal in this set of stopping times. For $r < k_0 - 1$ we have $h(r) < v_N(k_0 - 1)$. If we do not stop at the moment $r < k_0 - 1$ we get

$T v_N(r) = \sum_{j=r+1}^{k_0 - 1} \frac{r}{j(j-1)}\, v_N(k_0 - 1) + \frac{r}{k_0 - 1}\, v_N(k_0 - 1) = r\, v_N(k_0 - 1)\left(\frac{1}{r} - \frac{1}{k_0 - 1}\right) + \frac{r}{k_0 - 1}\, v_N(k_0 - 1) = v_N(k_0 - 1).$

It shows that $v_N = v_N(k_0 - 1)$ and the stopping rule $\tau^* = \min\{1 \le r \le N - 1 : r \ge k_0, R_r = 1\} \wedge N$ is optimal.

■
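The one-step look-ahead construction used in the proof can be sketched numerically. The code below is a hedged reading of the recursion, not the author's program: it assumes $h(s) = h(s, s) = \frac{s}{s+1} - c$, $h(k, j) = \frac{j}{j+1} - c\,\frac{N-j+1}{N-k+1}$, transition weights $\frac{s}{j(j-1)}$, and a terminal payoff of $\frac{1}{2} - c$ (or $\frac{1}{2}$ when the flag `last_cost` is switched off). The names `h`, `continuation` and `solve` are hypothetical.

```python
def h(k, j, n, c):
    """Expected payoff of accepting a relatively best item at time j < n,
    with the cost weight measured from time k (assumed form)."""
    return j / (j + 1) - c * (n - j + 1) / (n - k + 1)


def continuation(s, n, c, last_cost=True):
    """Expected payoff of rejecting a candidate at time s and accepting the
    next relatively best item, or item n if none appears."""
    final = 0.5 - (c if last_cost else 0.0)
    total = sum(s / (j * (j - 1)) * h(s, j, n, c) for j in range(s + 1, n))
    return total + s / (n - 1) * final


def solve(n, c, last_cost=True):
    """Return (k0, value): k0 is the smallest k such that stopping beats
    continuing at every s in [k, n-1]; value is the payoff of waiting
    until k0 (assumed to equal the continuation value at k0 - 1)."""
    k0 = n
    for s in range(n - 1, 0, -1):
        if h(s, s, n, c) >= continuation(s, n, c, last_cost):
            k0 = s
        else:
            break
    if k0 == 1:
        return k0, h(1, 1, n, c)
    return k0, continuation(k0 - 1, n, c, last_cost)


for cost in (0.0, 0.1, 0.2):
    print(10, cost, solve(10, cost))
```

Under these assumptions the sketch gives thresholds 3, 3 and 4 for $N = 10$ and $c = 0, \frac{1}{10}, \frac{2}{10}$, with values close to 0.733333, 0.654224 and 0.566339. Exact ties between stopping and continuing can make floating-point comparisons fragile, so exact rational arithmetic would be a safer choice for a systematic tabulation.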

Table 1

Optimal strategy and expected payoff according to Theorems 3.3 and 3.4.

                              Cost of decision
  N      c = 0                c = 1/10                     c = 2/10
         k_0    v_N           k_0              v_N         k_0             v_N
  5      2      0.65          2                0.571667    2               0.466667
  10     3      0.733333      3                0.654224    4               0.566339
  15     4      0.775         4                0.69564     5               0.608834
  50     7      0.868571      8                0.785822    9               0.70274
  100    10     0.905446      12               0.819826    14              0.734604
  ∞             1             [0.00251646 N]   0.9         [0.0340152 N]   0.8


Let the number of applicants tend to infinity. When the cost $c$ is positive, the value of the problem has a limit less than 1 and the asymptotic threshold is greater than 0.

Theorem 3.4 Let us assume that $c \in (0, +\infty)$. We have

(12) $\lim_{N \to \infty} v_N = 1 - c - \left(c + \frac{1}{2}\right)\alpha - \frac{c\alpha}{1 - \alpha}\log(\alpha),$

$\lim_{N \to \infty} \frac{k_0}{N} = \alpha,$

and $\alpha$ is the unique solution of the equation $\log(x) = \left(1 + \frac{1}{2c}\right)(x - 1)$ in $(0, 1)$.

Proof. It is a consequence of Theorem 3.3 and the observation from Remark 3.2.

■
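The asymptotic thresholds can be computed by solving the transcendental equation numerically. The sketch below uses plain bisection and assumes the equation has the form $\log(x) = a(x - 1)$ with $a = 1 + \frac{1}{2c}$ when the cost is also charged at the last moment and $a = \frac{1}{2c}$ when it is not. The function names are hypothetical. For $a > 1$ the function $\log(x) - a(x - 1)$ is negative at $e^{-a}$, positive at $1/a$ and increasing in between, so the bracket contains the unique root in $(0, 1)$.

```python
import math


def threshold_root(a, tol=1e-12):
    """Root in (0, 1) of log(x) = a * (x - 1) for a > 1, by bisection."""
    if a <= 1:
        raise ValueError("a must exceed 1 for a root inside (0, 1)")
    lo, hi = math.exp(-a), 1.0 / a  # f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log(mid) - a * (mid - 1.0) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def alpha(c):
    """Limit of k0 / N when the cost c is also charged at the last moment."""
    return threshold_root(1.0 + 1.0 / (2.0 * c))


def beta(c):
    """Limit of k0 / N when no cost is charged at the last moment (c < 1/2)."""
    return threshold_root(1.0 / (2.0 * c))


for cost in (0.1, 0.2):
    print(cost, alpha(cost), beta(cost))
```

The computed roots agree with the coefficients of $N$ printed in the last rows of the two tables, for example $\alpha \approx 0.00251646$ for $c = \frac{1}{10}$.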

Remark 3.5 Another natural payoff structure arises when at the last moment $N$ there is no cost of decision and $c \in [0, \frac{1}{2})$. In this case the decision maker will hesitate longer before he accepts a candidate than in the model with a cost of decision at the last moment. A numerical example is given in Table 2. The form of the optimal strategy is the same. The threshold $k_0$ is different. Its limit $\frac{k_0}{N} \to \beta$ fulfills the equation $\log(x) = \frac{1}{2c}(x - 1)$.
Table 2

Optimal strategy and expected payoff when there is no cost at the last moment.

                              Cost of decision
  N      c = 0                c = 1/10                     c = 2/10
         k_0    v_N           k_0              v_N         k_0             v_N
  5      2      0.65          3                0.6         3               0.566667
  10     3      0.73333       4                0.679003    5               0.626485
  15     4      0.775         5                0.716322    6               0.662696
  50     7      0.868571      9                0.799919    14              0.729829
  100    10     0.905446      14               0.830076    22              0.755734
  ∞             1             [0.00697715 N]   0.9         [0.107355 N]    0.8


4 Final remarks 



The cost of decision included in this model gives a parameter to measure the fear of the grader that his decision is too early. One can also imagine that the grader is able to observe the true value of the item over some fixed threshold, the level of price acceptable to him. In this case, the value of the threshold determines the expected number of observations until acceptance (see Porosiński and Szajowski (2000)). Such partial observation is easy for a human being to realize, and it is natural behaviour for many traders. They do not accept prices below some threshold.



In many real problems one can observe that the decision maker hesitates too long and postpones the final decision. He rejects the relatively best option for too long. It looks as if he fears losing the potential options. The level of fear can be dependent on the value of the item or independent of it. The model of choice for such a decision maker could be based on the multicriteria optimal stopping models considered by Gnedin (1981), Ferguson (1992), Samuels and Chotlos (1986) and recently by Sakaguchi and Szajowski (2000) and Bearden et al. (2005). In this model one variable is related to the value or rank of the applicant being searched. The second coordinate would be a measure of the undefined risk related to the decision process which the decision maker is feeling. From this point of view, research is needed to adopt the proper model for the considered case of item selection. It also opens a theoretical investigation to formulate variations of the best choice selection.